Proposal to provide the facility to set binary format output for specific OIDs per session

    davecramer@gmail.com 2022-07-22T15:00:18+00:00
    Greetings,

    Jack Christensen, the author of the Go pgx driver, had suggested: "Default result formats should be settable per session" (Discussion #5, postgresql-interfaces/enhancement-ideas) <https://github.com/postgresql-interfaces/enhancement-ideas/discussions/5>

    The JDBC driver has a similar problem and defers switching to binary format until a statement has been reused 5 times, at which point we create a named prepared statement and incur the overhead of an extra round trip for the DESCRIBE statement. Because the extra round trip generally negates any performance enhancements that receiving the data in binary format may provide, we avoid using binary and receive everything in text format until we are sure the extra trip is worth it.

    Connection pools further complicate the issue: we can't use named statements with connection pools since there is no binding of the connection to the client. As such, in the JDBC driver we recommend turning off the ability to create a named statement, and thus binary formats.

    As a proof of concept I provide the attached patch, which implements the ability to specify which OIDs will be returned in binary format per session, for instance `set format_binary='20,21,25'`. After that, the specified OIDs will be output in binary format if there is no describe statement, or even when using simpleQuery.

    Both the JDBC driver and the Go driver can exploit this change with no changes. I haven't confirmed if other drivers would work without changes.

    Furthermore, jackc/postgresql_simple_protocol_binary_format_bench <https://github.com/jackc/postgresql_simple_protocol_binary_format_bench> suggests that there is a considerable performance benefit. To quote: 'At 100 rows the text format takes 48% longer than the binary format.'

    Regards,
    Dave Cramer
      horikyota.ntt@gmail.com 2022-07-25T03:02:22+00:00
      At Fri, 22 Jul 2022 11:00:18 -0400, Dave Cramer <davecramer@gmail.com> wrote in
      > As a proof of concept I provide the attached patch which implements the
      > ability to specify which oids will be returned in binary format per
      > session.
      ...
      > Both the JDBC driver and the go driver can exploit this change with no
      > changes. I haven't confirmed if other drivers would work without changes.

      I'm not sure about the need for that, but the binary exchange format is not something that can be turned on while ignoring the peer's capability. If the JDBC driver wants some types to be sent in binary format, it seems that this can be specified in the bind message.

      regards.

      --
      Kyotaro Horiguchi
      NTT Open Source Software Center
        davecramer@gmail.com 2022-07-25T09:57:26+00:00
        On Sun, 24 Jul 2022 at 23:02, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
        > I'm not sure about the needs of that, but binary exchange format is
        > not the one that can be turned on ignoring the peer's capability.

        I'm not sure what this means. The client is specifying which types it wants in binary format.

        > If JDBC driver wants some types be sent in binary format, it seems to be
        > able to be specified in bind message.

        To be clear, it's not just the JDBC client; the original idea came from the author of the Go driver. And yes, you can specify it in the bind message, but you have to specify it in *every* bind message, which pretty much negates any advantage you might get out of binary format due to the extra round trip.

        Regards,
        Dave
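For reference, the per-request alternative discussed above places the result-column format codes inside each Bind message. The following is a minimal sketch of that message layout in C, assuming a statement with no parameters. The helper names (`put_int16`, `put_int32`, `put_cstring`, `build_bind`) are hypothetical and are not taken from any driver or from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write a big-endian 16-bit value at buf[off]; returns the next offset. */
static size_t put_int16(uint8_t *buf, size_t off, uint16_t v)
{
    buf[off] = (uint8_t) (v >> 8);
    buf[off + 1] = (uint8_t) (v & 0xFF);
    return off + 2;
}

/* Write a big-endian 32-bit value at buf[off]; returns the next offset. */
static size_t put_int32(uint8_t *buf, size_t off, uint32_t v)
{
    buf[off] = (uint8_t) (v >> 24);
    buf[off + 1] = (uint8_t) ((v >> 16) & 0xFF);
    buf[off + 2] = (uint8_t) ((v >> 8) & 0xFF);
    buf[off + 3] = (uint8_t) (v & 0xFF);
    return off + 4;
}

/* Copy a string including its terminating NUL; returns the next offset. */
static size_t put_cstring(uint8_t *buf, size_t off, const char *s)
{
    size_t n = strlen(s) + 1;

    memcpy(buf + off, s, n);
    return off + n;
}

/*
 * Build a Bind message for a statement without parameters (an assumption
 * made to keep the sketch short). result_formats holds one code per result
 * column: 0 = text, 1 = binary.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
size_t build_bind(uint8_t *buf, size_t cap,
                  const char *portal, const char *stmt,
                  const uint16_t *result_formats, uint16_t nformats)
{
    size_t need = 1 + 4 + strlen(portal) + 1 + strlen(stmt) + 1
                  + 2 + 2 + 2 + 2 * (size_t) nformats;
    size_t off = 0;

    if (cap < need)
        return 0;

    buf[off++] = 'B';                                 /* message type */
    off = put_int32(buf, off, (uint32_t) (need - 1)); /* length excludes type byte */
    off = put_cstring(buf, off, portal);
    off = put_cstring(buf, off, stmt);
    off = put_int16(buf, off, 0);                     /* no parameter format codes */
    off = put_int16(buf, off, 0);                     /* no parameter values */
    off = put_int16(buf, off, nformats);              /* result format code count */
    for (uint16_t i = 0; i < nformats; i++)
        off = put_int16(buf, off, result_formats[i]);

    assert(off == need);
    return off;
}
```

To pick per-column codes a driver typically has to know the result column types first, which is where the additional Describe round trip mentioned in this thread comes from.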
          jack@jackchristensen.com 2022-07-25T14:07:25+00:00
          On Mon, Jul 25, 2022 at 4:57 AM Dave Cramer <davecramer@gmail.com> wrote:
          > To be clear it's not just the JDBC client; the original idea came from the
          > author of go driver.
          > And yes you can specify it in the bind message but you have to specify it
          > in *every* bind message which pretty much negates any advantage you might
          > get out of binary format due to the extra round trip.

          The advantage is to be able to use the binary format with only a single network round trip in cases where prepared statements are not possible, e.g. when using PgBouncer. Using the simple protocol with this patch lets users of pgx (the Go driver mentioned above) and PgBouncer use the binary format. The performance gains can be significant, especially with types such as timestamptz that are very slow to parse.

          As far as only sending binary types that the client can understand, the client driver would call `set format_binary` at the beginning of the session.

          Jack Christensen
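To illustrate why binary timestamptz values are cheap to read compared with parsing a text timestamp, here is a minimal sketch in C. It assumes the integer encoding used by current servers: a big-endian 64-bit count of microseconds since 2000-01-01 00:00:00 UTC. The function names are hypothetical and do not come from pgx or the patch.

```c
#include <stdint.h>

/* Unix time of 2000-01-01 00:00:00 UTC, the PostgreSQL timestamp epoch. */
#define PG_EPOCH_UNIX_SECONDS INT64_C(946684800)

/* Read a big-endian 64-bit integer from the wire. */
static int64_t read_int64_be(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return (int64_t) v;
}

/*
 * Convert a binary timestamptz value (8 bytes) to microseconds since the
 * Unix epoch. No string scanning or time-zone parsing is involved.
 */
int64_t timestamptz_binary_to_unix_micros(const uint8_t *wire)
{
    return read_int64_be(wire) + PG_EPOCH_UNIX_SECONDS * INT64_C(1000000);
}
```

The whole decode is eight byte reads and one addition, whereas the text form requires parsing date, time, fractional seconds and a zone offset.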
            mail@joeconway.com 2022-07-25T17:30:43+00:00
            On 7/25/22 10:07, Jack Christensen wrote:
            > The advantage is to be able to use the binary format with only a single
            > network round trip in cases where prepared statements are not possible.
            > e.g. when using PgBouncer. Using the simple protocol with this patch
            > lets users of pgx (the Go driver mentioned above) and PgBouncer use the
            > binary format. The performance gains can be significant especially with
            > types such as timestamptz that are very slow to parse.
            >
            > As far as only sending binary types that the client can understand, the
            > client driver would call `set format_binary` at the beginning of the
            > session.

            +1 makes a lot of sense to me.

            Dave, please add this to the open commitfest (202209).

            --
            Joe Conway
            RDS Open Source Databases
            Amazon Web Services: https://aws.amazon.com
              sehrope@jackdb.com 2022-07-25T21:22:24+00:00
              Idea here makes sense and I've seen this brought up repeatedly on the JDBC lists.

              Does the driver need to be aware that this SET command was executed? I'm wondering what happens if an end user executes this with an OID the driver does not actually know how to handle.

              > + Oid *tmpOids = palloc(length+1);
              > ...
              > + tmpOids = repalloc(tmpOids, length+1);

              These should be: sizeof(Oid) * (length + 1)

              Also, I think you need to specify an explicit context via MemoryContextAlloc or the allocated memory will be in the default context and released at the end of the command.

              Regards,
              --
              Sehrope Sarkuni
              Founder & CEO | JackDB, Inc. | https://www.jackdb.com/
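The sizing point can be shown with a standalone sketch in C that parses a list such as '20,21,25' into an array of OIDs. This is not the patch's code: `parse_oid_list` is a hypothetical name, and plain `malloc`/`realloc` stand in for the server's allocators so the example compiles outside PostgreSQL. Inside the server, the allocation would instead need to live in a memory context that outlives the command, as noted above.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

typedef unsigned int Oid;   /* same underlying type PostgreSQL uses for Oid */

/*
 * Parse a comma-separated list of OIDs such as "20,21,25".
 * Returns a malloc'd array and stores the element count in *count, or
 * returns NULL on malformed input or allocation failure. An empty string
 * yields an empty (but non-NULL) array. The caller frees the result.
 */
Oid *parse_oid_list(const char *s, size_t *count)
{
    size_t len = 0;
    size_t cap = 4;
    Oid *oids;

    assert(s != NULL && count != NULL);
    *count = 0;

    /* Allocation sizes are in bytes, so scale by sizeof(Oid). */
    oids = malloc(sizeof(Oid) * cap);
    if (oids == NULL)
        return NULL;

    while (*s != '\0')
    {
        char *end;
        unsigned long v;

        /* Reject signs, whitespace and empty items before calling strtoul. */
        if (!isdigit((unsigned char) *s))
        {
            free(oids);
            return NULL;
        }

        errno = 0;
        v = strtoul(s, &end, 10);
        if (errno != 0 || v > 0xFFFFFFFFUL)
        {
            free(oids);
            return NULL;
        }

        if (len == cap)
        {
            Oid *tmp;

            cap *= 2;
            tmp = realloc(oids, sizeof(Oid) * cap);
            if (tmp == NULL)
            {
                free(oids);
                return NULL;
            }
            oids = tmp;
        }
        oids[len++] = (Oid) v;

        if (*end == ',')
        {
            end++;
            if (*end == '\0')   /* trailing comma */
            {
                free(oids);
                return NULL;
            }
        }
        else if (*end != '\0')  /* unexpected character after the number */
        {
            free(oids);
            return NULL;
        }
        s = end;
    }

    *count = len;
    return oids;
}
```

Passing only the element count to the allocator, as in the quoted lines, would reserve a quarter of the required bytes on platforms where `Oid` is four bytes wide.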
                davecramer@gmail.com 2022-07-25T21:53:10+00:00
                Hi Sehrope,

                On Mon, 25 Jul 2022 at 17:22, Sehrope Sarkuni <sehrope@jackdb.com> wrote:
                > Does the driver need to be aware that this SET command was executed? I'm
                > wondering what happens if an end user executes this with an OID the driver
                > does not actually know how to handle.

                I suppose there would be a failure to read the attribute correctly.

                > > + Oid *tmpOids = palloc(length+1);
                > > ...
                > > + tmpOids = repalloc(tmpOids, length+1);
                >
                > These should be: sizeof(Oid) * (length + 1)

                Yes they should, thanks!

                > Also, I think you need to specify an explicit context via
                > MemoryContextAlloc or the allocated memory will be in the default context
                > and released at the end of the command.

                Also good catch

                Thanks,
                Dave
                  davecramer@gmail.com 2022-07-26T12:11:04+00:00
                  Hi Sehrope,

                  On Mon, 25 Jul 2022 at 17:53, Dave Cramer <davecramer@gmail.com> wrote:
                  >> These should be: sizeof(Oid) * (length + 1)
                  >
                  > Yes they should, thanks!
                  >
                  >> Also, I think you need to specify an explicit context via
                  >> MemoryContextAlloc or the allocated memory will be in the default context
                  >> and released at the end of the command.
                  >
                  > Also good catch

                  Attached patch to correct these deficiencies.

                  Thanks again,
                  Dave
                    pryzby@telsasoft.com 2022-08-05T21:51:39+00:00
                    On Tue, Jul 26, 2022 at 08:11:04AM -0400, Dave Cramer wrote:
                    > Attached patch to correct these deficiencies.

                    You sent a patch to be applied on top of the first patch, but cfbot doesn't know that, so it says the patch doesn't apply.
                    http://cfbot.cputube.org/dave-cramer.html

                    BTW, a previous discussion about this idea is here:
                    https://www.postgresql.org/message-id/flat/40cbb35d-774f-23ed-3079-03f938aacdae@2ndquadrant.com

                    --
                    Justin